Towards A Dependency-Based Gold Standard For German Parsers: The TIGER Dependency Bank
Abstract
In this paper we discuss the construction, features and intended uses of the TiGer DB. The TiGer DB is a dependency bank derived from the TiGer Treebank; it contains predicate-argument relations and several grammatical features that can be considered semantically meaningful. It is produced semi-automatically by converting the TiGer Treebank into an LFG f-structure bank, which is in turn converted into the TiGer DB. This allows for relatively rapid construction. The grammatical relations and features encoded in the TiGer DB are chosen to keep the mapping from parser output (e.g. LFG f-structures or HPSG feature structures) to dependency triples simple. Hence, the TiGer DB can be used as a gold standard for the evaluation of German parsers.
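To make the evaluation use case concrete, the following is a minimal sketch of how parser output, once mapped to dependency triples, might be scored against a gold standard using precision, recall and F-score. The `Triple` layout, the relation labels (`sb`, `oa`, `tense`, `det`) and the example sentence are assumptions chosen for illustration; they are not taken from the TiGer DB specification.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Triple(NamedTuple):
    """Hypothetical dependency triple: relation(head, dependent)."""
    relation: str
    head: str
    dependent: str


def evaluate(gold: Iterable[Triple],
             predicted: Iterable[Triple]) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) of predicted triples against gold."""
    gold_counts = Counter(gold)
    pred_counts = Counter(predicted)
    # Multiset intersection: a triple counts as matched at most as often
    # as it occurs in both the gold and the predicted collection.
    matched = sum((gold_counts & pred_counts).values())
    n_pred = sum(pred_counts.values())
    n_gold = sum(gold_counts.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score


# Illustrative data only: invented labels for "Der Mann sieht den Hund".
GOLD = [
    Triple("sb", "sehen", "Mann"),
    Triple("oa", "sehen", "Hund"),
    Triple("tense", "sehen", "pres"),
    Triple("det", "Mann", "der"),
]
PREDICTED = [
    Triple("sb", "sehen", "Mann"),
    Triple("oa", "sehen", "Mann"),   # wrong dependent
    Triple("tense", "sehen", "pres"),
]

if __name__ == "__main__":
    p, r, f = evaluate(GOLD, PREDICTED)
    print(f"precision={p:.3f} recall={r:.3f} f-score={f:.3f}")
```

Treating each analysis as a multiset of triples keeps the comparison independent of the parser's native output format, which is the property the abstract relies on when it asks for a simple mapping from f-structures or feature structures to triples.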
Similar resources
On Representing Dependency Relations – Insights from Converting the German TiGerDB
Research in parser evaluation has led to the creation of dependency resources such as the TiGer Dependency Bank, a semi-automatic conversion of a subset of the TIGER Treebank. We explore the relationship between the TiGerDB representation and a more surface-oriented dependency analysis of German and describe how we mapped and recoded the TiGerDB into a format more closely linked to the original...
An Out-of-Domain Test Suite for Dependency Parsing of German
We present a dependency conversion of five German test sets from five different genres. The dependency representation is made as similar as possible to the dependency representation of TiGer, one of the two big syntactic treebanks of German. The purpose of these test sets is to enable researchers to test dependency parsing models on several different data sets from different text genres. We dis...
Treebank-Based Acquisition of Multilingual Unification Grammar Resources
Deep unification- (constraint-)based grammars are usually hand-crafted. Scaling such grammars from fragments to unrestricted text is time-consuming and expensive. This problem can be exacerbated in multilingual broad-coverage grammar development scenarios. Cahill et al. (2002, 2004) and O’Donovan et al. (2004) present an automatic f-structure annotation-based methodology to acquire broad-coverage...
DCU 250 Arabic Dependency Bank: An LFG Gold Standard Resource for the Arabic Penn Treebank
This paper describes the construction of a dependency bank gold standard for Arabic, DCU 250 Arabic Dependency Bank (DCU 250), based on the Arabic Penn Treebank Corpus (ATB) (Bies and Maamouri, 2003; Maamouri and Bies, 2004) within the theoretical framework of Lexical Functional Grammar (LFG). For parsing and automatically extracting grammatical and lexical resources from treebanks, it is neces...
Preliminary Experiments in Polish Dependency Parsing
Preliminary experiments presented in this paper consist in the induction and evaluation of a dependency parser for Polish. We train data-driven dependency models with publicly available parser-generation systems (MaltParser and MSTParser) given a converted dependency structure bank for Polish. Induced Polish dependency parsers are evaluated against a set of gold standard dependency structures u...